1.
bioRxiv ; 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38352519

ABSTRACT

Generating an accurate and complete genome annotation for an organism is complex because the cells within each tissue can express a unique set of transcript isoforms from a unique set of genes. A comprehensive genome annotation should contain information on what tissues express what transcript isoforms at what level. This tissue-level isoform information can then inform a wide range of research questions as well as experiment designs. Long-read sequencing technology combined with advanced full-length cDNA library preparation methods has now achieved throughput and accuracy where generating these types of annotations is achievable. Here, we show this by generating a genome annotation of the mouse (Mus musculus). We used the nanopore-based R2C2 long-read sequencing method to generate 64 million highly accurate full-length cDNA consensus reads, averaging 5.4 million reads per tissue for a dozen tissues. Using the Mandalorion tool, we processed these reads to generate the Tissue-level Atlas of Mouse Isoforms (TAMI, available at https://genome.ucsc.edu/s/vollmers/TAMI), which we believe will be a valuable complement to conventional, manually curated reference genome annotations.

2.
bioRxiv ; 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37662385

ABSTRACT

The sequencing of PCR amplicons is a core application of high-throughput sequencing technology. Using unique molecular identifiers (UMIs), individual amplified molecules can be sequenced to very high accuracy on an Illumina sequencer. However, Illumina sequencers have limited read length and are therefore restricted to sequencing amplicons shorter than 600 bp unless using inefficient synthetic long-read approaches. Native long-read sequencers from Pacific Biosciences and Oxford Nanopore Technologies can, using consensus read approaches, match or exceed Illumina quality while achieving much longer read lengths. Using a circularization-based concatemeric consensus sequencing approach (R2C2) paired with UMIs (R2C2+UMI), we show that we can sequence ~550 nt antibody heavy-chain (IGH) and ~1500 nt 16S amplicons at accuracies up to and exceeding Q50 (<1 error in 100,000 sequenced bases), which exceeds accuracies of UMI-supported Illumina paired sequencing as well as synthetic long-read approaches.
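The Q50 figure quoted in the abstract above follows from the standard Phred definition, Q = -10 * log10(P), where P is the per-base error probability. The short sketch below only illustrates that conversion; the function names are chosen for this example and do not come from the article.

```python
import math


def phred_to_error_rate(q: float) -> float:
    """Convert a Phred quality score Q into a per-base error probability."""
    return 10 ** (-q / 10)


def error_rate_to_phred(p: float) -> float:
    """Convert a per-base error probability P into a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def bases_per_error(q: float) -> float:
    """Expected number of sequenced bases per single error at quality Q."""
    return 1 / phred_to_error_rate(q)


if __name__ == "__main__":
    for q in (20, 30, 50):
        print(f"Q{q}: about 1 error in {bases_per_error(q):,.0f} bases")
```

Under this definition Q50 corresponds to an error probability of 1 in 100,000, which is the bound stated in the abstract.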

3.
bioRxiv ; 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37398362

ABSTRACT

Background: RNA-Seq has brought forth significant discoveries regarding aberrations in RNA processing, implicating these RNA variants in a variety of diseases. Aberrant splicing and single nucleotide variants in RNA have been demonstrated to alter transcript stability, localization, and function. In particular, the upregulation of ADAR, an enzyme which mediates adenosine-to-inosine editing, has been previously linked to an increase in the invasiveness of lung ADC cells and associated with splicing regulation. Despite the functional importance of studying splicing and SNVs, short read RNA-Seq has limited the community's ability to interrogate both forms of RNA variation simultaneously. Results: We employed long-read technology to obtain full-length transcript sequences, elucidating cis-effects of variants on splicing changes at a single molecule level. We have developed a computational workflow that augments FLAIR, a tool that calls isoform models expressed in long-read data, to integrate RNA variant calls with the associated isoforms that bear them. We generated nanopore data with high sequence accuracy of H1975 lung adenocarcinoma cells with and without knockdown of ADAR. We applied our workflow to identify key inosine-isoform associations to help clarify the prominence of ADAR in tumorigenesis. Conclusions: Ultimately, we find that a long-read approach provides valuable insight toward characterizing the relationship between RNA variants and splicing patterns.

4.
Genome Biol ; 24(1): 167, 2023 07 17.
Article in English | MEDLINE | ID: mdl-37461039

ABSTRACT

In this manuscript, we introduce and benchmark Mandalorion v4.1 for the identification and quantification of full-length transcriptome sequencing reads. It further improves upon the already strong performance of Mandalorion v3.6 used in the LRGASP consortium challenge. By processing real and simulated data, we show three main features of Mandalorion: first, Mandalorion-based isoform identification has very high precision and maintains high recall even in the absence of any genome annotation. Second, isoform read counts as quantified by Mandalorion show a high correlation with simulated read counts. Third, isoforms identified by Mandalorion closely reflect the full-length transcriptome sequencing data sets they are based on.


Subject(s)
High-Throughput Nucleotide Sequencing , Transcriptome , Protein Isoforms/genetics , Gene Expression Profiling , Sequence Analysis, RNA
5.
Curr Protoc ; 3(3): e705, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36947693

ABSTRACT

Promoters and the noncoding sequences that drive their function are fundamental aspects of genes that are critical to their regulation. The transcription preinitiation complex binds and assembles on promoters where it facilitates transcription. The transcription start site (TSS) is located downstream of the promoter sequence and is defined as the location in the genome where polymerase begins transcribing DNA into RNA. Knowing the location of TSSs is useful for annotation of genes, identification of non-coding sequences important to gene regulation, detection of alternative TSSs, and understanding of 5' UTR content. Several existing techniques make it possible to accurately identify TSSs, but are often difficult to perform experimentally, require large amounts of input RNA, or are unable to identify a large number of TSSs from a single sample. Many of these protocols take advantage of template switching reverse transcriptases (TSRTs), which reliably place an adaptor at the 5' end of a first strand synthesis of cDNA. Here, we introduce a protocol that exploits TSRT activity combined with rolling circle amplification to identify TSSs with several unique advantages over existing methods. Sequence adaptors are placed on the 5' and 3' end of the full-length cDNA copy of a transcript. A splint compatible with those adaptors is then used to circularize the full-length cDNA. Linear DNA containing concatemers of the cDNA are generated using rolling circle amplification, and a sequencing library is formed by fragmenting the concatemers. This protocol is straightforward to execute, requiring limited bench time with relatively stable reagents. Using extremely low amounts of RNA input, this protocol produces large numbers of accurate, deduplicated TSSs genome wide. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC. 
Basic Protocol 1: Splint generation
Basic Protocol 2: RNA extraction
Basic Protocol 3: cDNA synthesis
Basic Protocol 4: cDNA circularization and amplification
Basic Protocol 5: Library generation


Subject(s)
DNA , RNA , Base Sequence , DNA, Complementary , Transcription Initiation Site
6.
J Hered ; 114(1): 35-43, 2023 03 16.
Article in English | MEDLINE | ID: mdl-36146896

ABSTRACT

The Javan gibbon, Hylobates moloch, is an endangered gibbon species restricted to the forest remnants of western and central Java, Indonesia, and one of the rarest of the Hylobatidae family. Hylobatids consist of 4 genera (Hoolock, Hylobates, Symphalangus, and Nomascus) that are characterized by different numbers of chromosomes, ranging from 38 to 52. The underlying cause of this karyotype plasticity is not entirely understood, at least in part, due to the limited availability of genomic data. Here we present the first scaffold-level assembly for H. moloch using a combination of whole-genome Illumina short reads, 10X Chromium linked reads, PacBio, and Oxford Nanopore long reads and proximity-ligation data. This Hylobates genome represents a valuable new resource for comparative genomics studies in primates.


Subject(s)
Genome , Hylobates , Animals , Hylobates/genetics , Forests , Endangered Species , Indonesia
7.
Genome Res ; 32(11-12): 2092-2106, 2022.
Article in English | MEDLINE | ID: mdl-36351772

ABSTRACT

High-throughput short-read sequencing has taken on a central role in research and diagnostics. Hundreds of different assays take advantage of Illumina short-read sequencers, the predominant short-read sequencing technology available today. Although other short-read sequencing technologies exist, the ubiquity of Illumina sequencers in sequencing core facilities and the high capital costs of these technologies have limited their adoption. Among a new generation of sequencing technologies, Oxford Nanopore Technologies (ONT) holds a unique position because the ONT MinION, an error-prone long-read sequencer, is associated with little to no capital cost. Here we show that we can make short-read Illumina libraries compatible with the ONT MinION by using the rolling circle to concatemeric consensus (R2C2) method to circularize and amplify the short library molecules. This results in longer DNA molecules containing tandem repeats of the original short library molecules. This longer DNA is ideally suited for the ONT MinION, and after sequencing, the tandem repeats in the resulting raw reads can be converted into high-accuracy consensus reads with error rates similar to those of the Illumina MiSeq. We highlight this capability by producing and benchmarking RNA-seq, ChIP-seq, and regular and target-enriched Tn5 libraries. We also explore the use of this approach for rapid evaluation of sequencing library metrics by implementing a real-time analysis workflow.


Subject(s)
Nanopores , Sequence Analysis, DNA/methods , Gene Library , High-Throughput Nucleotide Sequencing/methods , Chromatin Immunoprecipitation Sequencing
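The abstract above describes turning tandem repeats within one raw read into a single high-accuracy consensus read. As a rough illustration of why repeated copies help, the toy sketch below takes a per-column majority vote over copies that are assumed to be already split and aligned to equal length (with "-" marking gaps). It is not the published R2C2 processing pipeline, which must first locate and align the repeats; the function name and inputs are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_copies):
    """Toy consensus: per-column majority vote over pre-aligned repeat copies.

    Assumes every copy has the same length and uses '-' for alignment gaps.
    Ties are resolved in favour of the symbol seen first in the column.
    """
    if not aligned_copies:
        raise ValueError("need at least one copy")
    length = len(aligned_copies[0])
    if any(len(copy) != length for copy in aligned_copies):
        raise ValueError("copies must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_copies):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != "-":  # drop columns where a gap wins the vote
            consensus.append(symbol)
    return "".join(consensus)


if __name__ == "__main__":
    # Four noisy copies of the same short molecule, each with at most one error.
    copies = ["ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC"]
    print(majority_consensus(copies))
```

Because independent errors rarely hit the same position in most copies, the vote recovers the underlying sequence; more copies per raw read generally mean a more accurate consensus.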
8.
Science ; 376(6599): 1333-1338, 2022 06 17.
Article in English | MEDLINE | ID: mdl-35709290

ABSTRACT

Polar bears are susceptible to climate warming because of their dependence on sea ice, which is declining rapidly. We present the first evidence for a genetically distinct and functionally isolated group of polar bears in Southeast Greenland. These bears occupy sea-ice conditions resembling those projected for the High Arctic in the late 21st century, with an annual ice-free period that is >100 days longer than the estimated fasting threshold for the species. Whereas polar bears in most of the Arctic depend on annual sea ice to catch seals, Southeast Greenland bears have a year-round hunting platform in the form of freshwater glacial mélange. This suggests that marine-terminating glaciers, although of limited availability, may serve as previously unrecognized climate refugia. Conservation of Southeast Greenland polar bears, which meet criteria for recognition as the world's 20th polar bear subpopulation, is necessary to preserve the genetic diversity and evolutionary potential of the species.


Subject(s)
Conservation of Natural Resources , Global Warming , Ice Cover , Ursidae , Animals , Arctic Regions , Extinction, Biological , Greenland , Population Dynamics , Seals, Earless
9.
Genome Biol ; 23(1): 47, 2022 02 07.
Article in English | MEDLINE | ID: mdl-35130954

ABSTRACT

High-throughput single-cell analysis today is facilitated by protocols like the 10X Genomics platform or Drop-Seq which generate cDNA pools in which the origin of a transcript is encoded at its 5' or 3' end. Here, we used R2C2 to sequence and demultiplex 12 million full-length cDNA molecules generated by the 10X Genomics platform from ~3000 peripheral blood mononuclear cells. We use these reads, independent from Illumina data, to identify B cell, T cell, and monocyte clusters and generate isoform-level transcriptomes for cells and cell types. Finally, we extract paired adaptive immune receptor sequences unique to each T and B cell.


Subject(s)
Leukocytes, Mononuclear , Single-Cell Analysis , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Protein Isoforms/genetics , Sequence Analysis, RNA/methods
10.
J Biol Chem ; 296: 100784, 2021.
Article in English | MEDLINE | ID: mdl-34000296

ABSTRACT

RNA-seq is routinely used to measure gene expression changes in response to cell perturbation. Genes upregulated or downregulated following some perturbation are designated as genes of interest, and their most expressed isoform(s) would then be selected for follow-up experimentation. However, because of its need to fragment RNA molecules, RNA-seq is limited in its ability to capture gene isoforms and their expression patterns. This lack of isoform-specific data means that isoforms would be selected based on annotation databases that are incomplete, not tissue specific, or do not provide key information on expression levels. As a result, minority or nonexistent isoforms might be selected for follow-up, leading to loss in valuable resources and time. There is therefore a great need to comprehensively identify gene isoforms along with their corresponding levels of expression. Using the long-read nanopore-based R2C2 method, which does not fragment RNA molecules, we generated an Isoform-level transcriptome Atlas of Macrophage Activation that identifies full-length isoforms in primary human monocyte-derived macrophages. Macrophages are critical innate immune cells important for recognizing pathogens through binding of pathogen-associated molecular patterns to toll-like receptors, culminating in the initiation of host defense pathways. We characterized isoforms for most moderately-to-highly expressed genes in resting and toll-like receptor-activated monocyte-derived macrophages, identified isoforms differentially expressed between conditions, and validated these isoforms by RT-qPCR. We compiled these data into a user-friendly data portal within the UCSC Genome Browser (https://genome.ucsc.edu/s/vollmers/IAMA). Our atlas represents a valuable resource for innate immune research, providing unprecedented isoform information for primary human macrophages.


Subject(s)
Macrophage Activation , Transcriptome , Cells, Cultured , Gene Expression Profiling , Humans , Macrophages/immunology , Macrophages/metabolism , Protein Isoforms/genetics
11.
J Hered ; 112(4): 377-384, 2021 07 15.
Article in English | MEDLINE | ID: mdl-33882130

ABSTRACT

The Andean bear is the only extant member of the Tremarctine subfamily and the only extant ursid species to inhabit South America. Here, we present an annotated de novo assembly of a nuclear genome from a captive-born female Andean bear, Mischief, generated using a combination of short and long DNA and RNA reads. Our final assembly has a length of 2.23 Gb, and a scaffold N50 of 21.12 Mb, contig N50 of 23.5 kb, and BUSCO score of 88%. The Andean bear genome will be a useful resource for exploring the complex phylogenetic history of extinct and extant bear species and for future population genetics studies of Andean bears.


Subject(s)
Ursidae , Animals , Cell Nucleus , Female , Genome , Molecular Sequence Annotation , Phylogeny , South America , Ursidae/genetics
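The scaffold and contig N50 values reported in the abstract above use the usual assembly statistic: the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of that calculation follows; the function name and the example lengths are illustrative and are not data from the article.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until the
    running total reaches at least half of the total assembled length; the
    length of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    example_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # illustrative lengths only
    print(n50(example_contigs))
```

A scaffold N50 far larger than the contig N50, as in this assembly, indicates that long-range scaffolding joined many short contigs into much longer sequences.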
12.
Proc Natl Acad Sci U S A ; 118(7)2021 02 16.
Article in English | MEDLINE | ID: mdl-33568531

ABSTRACT

Recent studies have identified thousands of long noncoding RNAs (lncRNAs) in mammalian genomes that regulate gene expression in different biological processes. Although lncRNAs have been identified in a variety of immune cells and implicated in immune response, the biological function and mechanism of the majority remain unexplored, especially in sepsis. Here, we identify a role for a lncRNA, gastric adenocarcinoma predictive long intergenic noncoding RNA (GAPLINC), previously characterized for its role in cancer, now in the context of innate immunity, macrophages, and LPS-induced endotoxic shock. Transcriptome analysis of macrophages from humans and mice reveals that GAPLINC is a conserved lncRNA that is highly expressed following macrophage differentiation. Upon inflammatory activation, GAPLINC is rapidly down-regulated. Macrophages depleted of GAPLINC display enhanced expression of inflammatory genes at baseline, while overexpression of GAPLINC suppresses this response. Consistent with GAPLINC-depleted cells, Gaplinc knockout mice display enhanced basal levels of inflammatory genes and show resistance to LPS-induced endotoxic shock. Mechanistically, survival is linked to increased levels of nuclear NF-κB in Gaplinc knockout mice that drives basal expression of target genes typically only activated following inflammatory stimulation. We show that this activation of immune response genes prior to LPS challenge leads to decreased blood clot formation, which protects Gaplinc knockout mice from multiorgan failure and death. Together, our results identify a previously unknown function for GAPLINC as a negative regulator of inflammation and uncover a key role for this lncRNA in modulating endotoxic shock.


Subject(s)
Immunity, Innate , Shock, Septic/immunology , Animals , Cells, Cultured , Female , Humans , Lipopolysaccharides/toxicity , Male , Mice , Mice, Inbred C57BL , NF-kappa B/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Shock, Septic/etiology , Shock, Septic/genetics , THP-1 Cells , Transcriptome
13.
Science ; 371(6530)2021 02 12.
Article in English | MEDLINE | ID: mdl-33574182

ABSTRACT

The evolutionarily conserved splicing regulator neuro-oncological ventral antigen 1 (NOVA1) plays a key role in neural development and function. NOVA1 also includes a protein-coding difference between the modern human genome and Neanderthal and Denisovan genomes. To investigate the functional importance of an amino acid change in humans, we reintroduced the archaic allele into human induced pluripotent cells using genome editing and then followed their neural development through cortical organoids. This modification promoted slower development and higher surface complexity in cortical organoids with the archaic version of NOVA1. Moreover, levels of synaptic markers and synaptic protein coassociations correlated with altered electrophysiological properties in organoids expressing the archaic variant. Our results suggest that the human-specific substitution in NOVA1, which is exclusive to modern humans since divergence from Neanderthals, may have had functional consequences for our species' evolution.


Subject(s)
Cerebral Cortex/growth & development , Cerebral Cortex/physiology , Neanderthals/genetics , Neurons/physiology , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Alleles , Alternative Splicing , Amino Acid Substitution , Animals , Binding Sites , Biological Evolution , CRISPR-Cas Systems , Cell Proliferation , Cerebral Cortex/cytology , Gene Expression Regulation, Developmental , Genetic Variation , Genome , Genome, Human , Haplotypes , Hominidae/genetics , Humans , Induced Pluripotent Stem Cells , Nerve Net/physiology , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/metabolism , Neuro-Oncological Ventral Antigen , Organoids , Synapses/physiology
14.
Cell Rep ; 33(13): 108541, 2020 12 29.
Article in English | MEDLINE | ID: mdl-33378675

ABSTRACT

Macrophages are critical effector cells of the immune system, and understanding genes involved in their viability and function is essential for gaining insights into immune system dysregulation during disease. We use a high-throughput, pooled-based CRISPR-Cas screening approach to identify essential genes required for macrophage viability. In addition, we target 3' UTRs to gain insights into previously unidentified cis-regulatory regions that control these essential genes. Next, using our recently generated nuclear factor κB (NF-κB) reporter line, we perform a fluorescence-activated cell sorting (FACS)-based high-throughput genetic screen and discover a number of previously unidentified positive and negative regulators of the NF-κB pathway. We unravel complexities of the TNF signaling cascade, showing that it can function in an autocrine manner in macrophages to negatively regulate the pathway. Utilizing a single complex library design, we are capable of interrogating various aspects of macrophage biology, thus generating a resource for future studies.


Subject(s)
Flow Cytometry/methods , High-Throughput Screening Assays/methods , Inflammation/genetics , Inflammation/metabolism , Macrophages/physiology , NF-kappa B/physiology , Tumor Necrosis Factor-alpha/physiology , 3' Untranslated Regions , Animals , CRISPR-Cas Systems , Cell Line , Cell Survival , Clustered Regularly Interspaced Short Palindromic Repeats , Gene Expression Regulation , HEK293 Cells , Humans , Mice , RNA, Guide, Kinetoplastida/genetics , Signal Transduction
15.
medRxiv ; 2020 Nov 03.
Article in English | MEDLINE | ID: mdl-33173926

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a leading cause of death worldwide. Genome-wide association studies (GWAS) have identified over 80 loci that are associated with COPD and emphysema; however, for most of these loci the causal variant and gene are unknown. Here, we utilize lung splice quantitative trait loci (sQTL) data from the Genotype-Tissue Expression project (GTEx) and short read sequencing data from the Lung Tissue Research Consortium (LTRC) to characterize a locus in nephronectin (NPNT) associated with COPD case-control status and lung function. We found that the rs34712979 variant is associated with alternative splice junction use in NPNT, specifically for the junction connecting the 2nd and 4th exons (chr4:105898001-105927336) (p = 4.02 × 10^-38). This association colocalized with GWAS data for COPD and lung spirometry measures with a posterior probability of 94%, indicating that the same causal genetic variants in NPNT underlie the associations with COPD risk, spirometric measures of lung function, and splicing. Investigation of NPNT short read sequencing revealed that rs34712979 creates a cryptic splice acceptor site which results in the inclusion of a 3 nucleotide exon extension, coding for a serine residue near the N-terminus of the protein. Using Oxford Nanopore Technologies (ONT) long read sequencing we identified 13 NPNT isoforms, 6 of which are predicted to be protein coding. Two of these are full length isoforms which differ only in the 3 nucleotide exon extension whose occurrence differs by genotype. Overall, our data indicate that rs34712979 modulates COPD risk and lung function by creating a novel splice acceptor which results in the inclusion of a 3 nucleotide sequence coding for a serine in the nephronectin protein sequence. Our findings implicate NPNT splicing in contributing to COPD risk, and identify a novel serine insertion in the nephronectin protein that warrants further study.

16.
PLoS Genet ; 16(8): e1008935, 2020 08.
Article in English | MEDLINE | ID: mdl-32841233

ABSTRACT

Bacterial symbionts bring a wealth of functions to the associations they participate in, but by doing so, they endanger the genes and genomes underlying these abilities. When bacterial symbionts become obligately associated with their hosts, their genomes are thought to decay towards an organelle-like fate due to decreased homologous recombination and inefficient selection. However, numerous associations exist that counter these expectations, especially in marine environments, possibly due to ongoing horizontal gene flow. Despite extensive theoretical treatment, no empirical study thus far has connected these underlying population genetic processes with long-term evolutionary outcomes. By sampling marine chemosynthetic bacterial-bivalve endosymbioses that range from primarily vertical to strictly horizontal transmission, we tested this canonical theory. We found that transmission mode strongly predicts homologous recombination rates, and that exceedingly low recombination rates are associated with moderate genome degradation in the marine symbionts with nearly strict vertical transmission. Nonetheless, even the most degraded marine endosymbiont genomes are occasionally horizontally transmitted and are much larger than their terrestrial insect symbiont counterparts. Therefore, horizontal transmission and recombination enable efficient natural selection to maintain intermediate symbiont genome sizes and substantial functional genetic variation.


Subject(s)
Bacteria/pathogenicity , Bivalvia/microbiology , Gene Transfer, Horizontal , Genome, Bacterial , Recombination, Genetic , Symbiosis/genetics , Animals , Bacteria/genetics , Bivalvia/genetics , Evolution, Molecular , Genetic Variation
17.
Nucleic Acids Res ; 48(13): e75, 2020 07 27.
Article in English | MEDLINE | ID: mdl-32491177

ABSTRACT

A high quality genome assembly is a vital first step for the study of an organism. Recent advances in technology have made the creation of high quality chromosome scale assemblies feasible and low cost. However, the amount of input DNA needed for an assembly project can be a limiting factor for small organisms or precious samples. Here we demonstrate the feasibility of creating a chromosome scale assembly using a hybrid method for a low input sample, a single outbred Drosophila melanogaster. Our approach combines an Illumina shotgun library, Oxford Nanopore long reads, and chromosome conformation capture for long range scaffolding. This single fly genome assembly has an N50 of 26 Mb, a length that encompasses entire chromosome arms, contains 95% of expected single copy orthologs, and a nearly complete assembly of this individual's Wolbachia endosymbiont. The methods described here enable the accurate and complete assembly of genomes from small, field collected organisms as well as precious clinical samples.


Subject(s)
Chromosomes, Bacterial/genetics , Chromosomes, Insect/genetics , Drosophila melanogaster/genetics , Genome, Bacterial/genetics , Genome, Insect/genetics , Wolbachia/genetics , Animals , Genomics/methods
18.
Cell Rep ; 31(8): 107668, 2020 05 26.
Article in English | MEDLINE | ID: mdl-32460011

ABSTRACT

The liver is a key regulator of systemic energy homeostasis whose proper function is dependent on the circadian clock. Here, we show that livers deficient in the oscillator component JARID1a exhibit a dysregulation of genes involved in energy metabolism. Importantly, we find that mice that lack hepatic JARID1a have decreased lean body mass, decreased respiratory exchange ratios, faster production of ketones, and increased glucose production in response to fasting. Finally, we find that JARID1a loss compromises the response of the hepatic transcriptome to nutrient availability. In all, ablation of hepatic JARID1a disrupts the coordination of hepatic metabolic programs with whole-body consequences.


Subject(s)
DNA-Binding Proteins/metabolism , Feeding Behavior/physiology , Jumonji Domain-Containing Histone Demethylases/metabolism , Liver/metabolism , Adaptation, Physiological , Animals , Circadian Rhythm/physiology , DNA-Binding Proteins/deficiency , DNA-Binding Proteins/genetics , Humans , Jumonji Domain-Containing Histone Demethylases/deficiency , Jumonji Domain-Containing Histone Demethylases/genetics , Mice , Mice, Knockout
19.
Genome Res ; 30(4): 589-601, 2020 04.
Article in English | MEDLINE | ID: mdl-32312742

ABSTRACT

The human immune system relies on highly complex and diverse transcripts and the proteins they encode. These include transcripts encoding human leukocyte antigen (HLA) receptors as well as B cell and T cell receptors (BCR and TCR). Determining which alleles an individual possesses for each HLA gene (high-resolution HLA typing) is essential to establish donor-recipient compatibility in organ and bone marrow transplantations. In turn, the repertoires of millions of unique BCR and TCR transcripts in each individual carry a vast amount of health-relevant information. Both short-read RNA-seq-based HLA typing and BCR/TCR repertoire sequencing (AIRR-seq) currently rely on our incomplete knowledge of the genetic diversity at HLA and BCR/TCR loci. Here, we generated over 10,000,000 full-length cDNA sequences at a median accuracy of 97.9% using our nanopore sequencing-based Rolling Circle Amplification to Concatemeric Consensus (R2C2) protocol. We used this data set to (1) show that deep and accurate full-length cDNA sequencing can be used to provide isoform-level transcriptome analysis for more than 9000 loci, (2) generate accurate sequences of HLA alleles, and (3) extract detailed AIRR data for the analysis of the adaptive immune system. The HLA and AIRR analysis approaches we introduce here are untargeted and therefore do not require prior knowledge of the composition or genetic diversity of HLA and BCR/TCR loci.


Subject(s)
DNA, Complementary , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Immune System/cytology , Immune System/metabolism , Transcriptome , Alleles , Alternative Splicing , Female , Gene Expression Profiling/methods , Gene Expression Regulation , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Histocompatibility Testing , Humans , Male , Mutation , Receptors, Immunologic